

Diagnostic  Reasoning  with 
Multilevel  Set-Covering  Models 

Joachim Baumeister and Dietmar Seipel


Abstract. We consider multilevel set-covering models for diagnostic
reasoning: though a lot of work has been done in this field, knowledge
acquisition efforts have been investigated only insufficiently.
We will show how set-covering models can be built incrementally
and how they can be refined by knowledge enhancements or
representational extensions. All these extensions have a primary
characteristic: they can be applied without changing the basic semantics
of the model.

Keywords: set-covering diagnosis; model-based diagnosis; qualitative
modeling; knowledge acquisition; abductive reasoning

1 Introduction 

In this paper we present a new interpretation of set-covering
models [1] which is a suitable representation for the manual
development of knowledge-based systems. Because of their simple
semantics, set-covering models are rapidly understood by experts, but
still maintain a well-known model-based interpretation. In [2] we
showed how knowledge-based diagnostic systems can be developed
incrementally with set-covering models, thus supporting rapid
prototyping of such systems. In this paper we extend this approach
to multilevel set-covering models, and we describe how simple
set-covering models can be enhanced by representational extensions.
Practical experience has shown that these additions facilitated the
development of a real-world example from a medical ICU domain.

A set-covering model consists of a set of diagnoses, a set of findings
(observations) and covering relations between the elements of
these two sets. There exists a covering relation between a diagnosis
and a finding, iff the diagnosis implies the observation of the
finding. We can define covering relations between diagnoses as well, iff
a diagnosis implies the observation of another diagnosis. The basic
idea of set-covering diagnosis is the detection of a reasonable set of
diagnoses which can explain the given observations. To do this, we
propose an abductive reasoning step: firstly, hypotheses are generated
in order to explain the given observations. Secondly, competing
hypotheses are ranked using a quality measure.

Reasoning with set-covering models has a long tradition in diagnostic
reasoning: Early work was done by Patil [3] with his system ABEL,
which implemented a comprehensive set-covering representation
including causal, associational and grouping relations.
Reggia et al. [1] contributed a formal approach to set-covering models
and addressed the problem of hypothesis generation with a precise
description of generator sets. Later [4] they introduced the
integration of Bayesian probabilities in set-covering models. With the
system MOLE [5] Eshelman focused on the problem of acquiring
set-covering knowledge. He proposed an interactive process that allows
for refining previously acquired knowledge after a reasoning
step to differentiate between conflicting hypotheses. Console et al.
[6] showed with the system CHECK how to combine heuristic and
causal knowledge. There, heuristic knowledge was used to find
reasonable hypotheses for a given observation. In a second step the
causal knowledge was used to generate abductive explanations for
the hypotheses. Long [7] extended covering models with probabilities
and a rich syntax of temporal and non-temporal causation events.
Since knowledge acquisition is a cost-sensitive task, reuse of existing
knowledge is another emerging aspect. Puppe [8] showed how
set-covering knowledge can be combined with other classes of knowledge
like heuristic rules, case-based knowledge or decision trees.

Most of these approaches only investigated syntax and semantics of
the reasoning process, but did not consider the knowledge engineering
process. Eshelman's MOLE system [5] differs from our knowledge
acquisition approach, since there knowledge refinement is performed
by adding new covering relations to the model. In our paper
we present (multilevel) set-covering models and show how to
enrich these simple models with knowledge enhancements like
similarities and weights or representational extensions for more complex
covering relations. A primary characteristic of the presented
extensions is their incrementality: each extension can be applied
independently of other enhancements and will not change the basic
semantics of the model, but refine special aspects of it.

The rest of the paper is organized as follows: In Section 2 we
introduce the basic concepts of set-covering models and show how
to enrich set-covering models with additional knowledge like
similarities and weights. Beyond that, in Section 3 we introduce
representational extensions of set-covering models that enable us to
formulate complex covering relations (conjunctions, disjunctions,
cardinalities). In Section 4 we briefly summarize the problem of
hypothesis generation and introduce constraints (exclusions and
necessary relations) that shrink the exponential size of the hypothesis
space. We conclude this paper in Section 5 with an overview
of the work we have done so far and promising directions we are
planning to work on in the future.

2 Set-Covering  Models 

A set-covering model consists of a set of diagnoses, a set of findings
(observations) and covering relations between the elements of these
two sets. There exists a covering relation between a diagnosis and
a finding, iff the diagnosis predicts the observation of the finding.
Furthermore we can define covering relations between two diagnoses
to state that a diagnosis implies another diagnosis. In this way we
can build a covering-tree for a diagnosis, where we postulate that
the leaves of the covering-tree have to be observable findings. So each
covering path will start with a diagnosis and lead to an observable
finding.

2.1 The Basic Model

The basic idea of set-covering diagnosis is the detection of a reasonable
set of diagnoses which can explain the given observation of findings.
In an abductive reasoning step, hypotheses are first generated
in order to explain the given observations (hypothesis generation). In
a second step, we define a quality measure for ranking competing
hypotheses (hypothesis testing). Set-covering models describe relations
like:

A diagnosis $D$ predicts that the parameters $A_1, \ldots, A_n$ are
observed with corresponding values $v_1, \ldots, v_n$.

A diagnosis $D$ predicts the diagnoses $D_1, \ldots, D_m$.

We call each of these relations covering relations and we denote them
by

$$r_i = D \rightarrow A_i{:}v_i, \quad 1 \leq i \leq n,$$
$$r'_i = D \rightarrow D_i, \quad 1 \leq i \leq m.$$

Covering models can be visually described as in Figure 1. In this
example the model states that diagnosis Flu implies the observation
of the diagnoses Fever and Cold. Diagnosis Fever itself forces the
observation of the attributes Temperature and Skin with their
corresponding values Increased and Sweating.

Figure 1. Basic set-covering model for diagnoses Flu, Fever and Cold.

The basic algorithm for set-covering diagnosis is very simple: Given
a set of observed findings, it uses a simple hypothesize-and-test
strategy, which generates hypotheses (coined from diagnoses) in the first
step and tests them against the given observations in a second step.
The test is defined by calculating a quality measure, which expresses
the covering degree of the hypothesis regarding the observed findings.
The generation and evaluation of the hypotheses is an iterative
process, which stops when a satisfying hypothesis has been found or
all hypotheses have been considered. Usually the algorithm will look
at single diagnoses, compute the corresponding quality measure, and
then it will generate hypotheses with multiple diagnoses, if needed.
In the worst case this procedure will generate $2^n$ candidates for $n$
diagnoses. So heuristics are needed to keep the method computationally
tractable (cf. Section 4).

The basic sets for this task are the following: We define $\Omega_D$ to be
the set of all diagnoses and $\Omega_A$ the set of all observable parameters
(attributes). To each parameter $A \in \Omega_A$ a range $dom(A)$ of values
is assigned, and $\Omega_V = \bigcup_{A \in \Omega_A} dom(A)$ is the set of all possible
values for the parameters. If a parameter $A$ is assigned to a value $v$,
then we call $A{:}v$ a finding.

$$\Omega_F = \{ A{:}v \mid A \in \Omega_A, \ v \in dom(A) \}$$

is the set of all findings. Furthermore we call an element
$S \in \Omega_S = \Omega_D \cup \Omega_F$ a state.
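The hypothesize-and-test strategy of the basic algorithm can be sketched as follows. This is only a minimal illustration under assumptions that are not part of the paper: the quality measure is passed in as an arbitrary callable `quality`, and the parameter `threshold` is a hypothetical stopping criterion standing in for "a satisfying hypothesis has been found". Hypotheses are enumerated by increasing size, so single diagnoses are tested first.

```python
from itertools import combinations


def hypothesize_and_test(diagnoses, quality, threshold=1.0):
    """Return (best_hypothesis, best_quality, number_of_tested_hypotheses).

    `quality` maps a frozenset of diagnoses to a value in [0, 1].
    `threshold` is an assumed stopping criterion, not taken from the paper.
    """
    best, best_q, tested = frozenset(), 0.0, 0
    names = sorted(diagnoses)
    for size in range(1, len(names) + 1):       # single diagnoses first
        for combo in combinations(names, size):
            hypothesis = frozenset(combo)
            q = quality(hypothesis)
            tested += 1
            if q > best_q:
                best, best_q = hypothesis, q
            if q >= threshold:                  # satisfying hypothesis found
                return best, best_q, tested
    return best, best_q, tested                 # all non-empty subsets were tried
```

If no hypothesis reaches the threshold, the loop visits all $2^n - 1$ non-empty subsets of the $n$ diagnoses, which is the exponential worst case that motivates the heuristics of Section 4.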

A covering relation $r$ between a diagnosis $D$ and a state $S$ ($S \neq D$)
is denoted by $r = D \rightarrow S$. We say that "$D$ predicts $S$" or that "$D$
covers $S$". Then $c_r = D$ is called the cause and $e_r = S$ is called the
effect. We define $\Omega_R$ to be the set of all covering relations contained
in the model. Then $D^+ \subseteq \Omega_R$ is the set of all covering relations with
diagnosis $D$ as the cause, i.e. $D^+ = \{ r \in \Omega_R \mid c_r = D \}$. E.g.,
for the model in Figure 1 we obtain $c_{r_1} = Flu$ and $e_{r_1} = Fever$,
$Cold^+ = \{r_5, r_6\}$.

Since $S$ can be a diagnosis itself, we are able to build multilevel
set-covering models. A state $S$ transitively covers another state $S'$, if
either $S$ covers $S'$ or $S$ covers another state $S''$ that transitively
covers $S'$.


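We call $\mathcal{F}_O \subseteq \Omega_F$ the set of observed findings and a set $\mathcal{H} \subseteq \Omega_D$
of diagnoses a hypothesis. A finding that is not transitively covered
by the hypothesis $\mathcal{H}$ is called isolated, and the set of all observed
findings that are isolated will be denoted by $\mathcal{F}^{isolated}_{\mathcal{H},O} \subseteq \mathcal{F}_O$. E.g. for
a hypothesis $\mathcal{H} = \{D_1\}$ and $\mathcal{F}_O = \{A_1{:}v_1, A_2{:}v_2, A_4{:}v_4\}$ we
obtain $\mathcal{F}^{isolated}_{\mathcal{H},O} = \{A_2{:}v_2\}$.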
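To make the notions of transitive covering and isolated findings concrete, the following sketch stores the covering relations as a map from each cause to its directly covered states. The Flu/Fever part follows the description of Figure 1; the two findings of Cold and the encoding of findings as "parameter:value" strings are assumptions made only for this illustration.

```python
# Covering relations as an adjacency map: cause -> directly covered states.
# The findings listed for "Cold" are hypothetical placeholders.
MODEL = {
    "Flu": ["Fever", "Cold"],
    "Fever": ["Temperature:Increased", "Skin:Sweating"],
    "Cold": ["Nose:Running", "Throat:Sore"],
}


def transitively_covered(state, model):
    """All states reachable from `state` via one or more covering relations."""
    seen, stack = set(), list(model.get(state, []))
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(model.get(s, []))
    return seen


def isolated_findings(hypothesis, observed, model):
    """Observed findings that no diagnosis of the hypothesis covers transitively."""
    covered = set()
    for diagnosis in hypothesis:
        covered |= transitively_covered(diagnosis, model)
    return set(observed) - covered
```

For the hypothesis {Fever} and the observation {Temperature:Increased, Nose:Running}, the finding Nose:Running is isolated; for the hypothesis {Flu} no observed finding of this model is isolated.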


Figure  2.  Basic  set-covering  model  for  diagnosis  D 


Now we will describe the computation of the precision of a state for
a given observation. The precision $\pi(S)$ of a state $S$ provides a real
value between 0 and 1 to describe the degree of accuracy with which
the covered states of $S$ are observed.

Bottom-Up Computation of Precisions. Given the set $\mathcal{F}_O$ of observed
findings, the precision $\pi$ of each state is computed bottom-up
starting with the findings:

$$\pi(A{:}v) = \begin{cases} 1, & \text{if } A{:}v \in \mathcal{F}_O \\ 0, & \text{otherwise} \end{cases} \qquad (1)$$

The precision $\pi(D)$ of a diagnosis $D$ can be computed as soon as the
precisions of all its successors $S$ are known. For this we define

$$D^+_{\geq c} = \{ r \in D^+ \mid \pi(e_r) \geq c(e_r) \},$$

$$D^+_{>0} = \{ r \in D^+ \mid \pi(e_r) > 0 \},$$

as the sets of all relevant covering relations, i.e. relations that predict
states with a precision greater than a user-defined threshold function.

$$\pi(D) = \begin{cases} \dfrac{\sum_{r \in D^+_{\geq c}} \pi(e_r)}{|D^+_{>0}|}, & \text{if } D^+_{>0} \neq \emptyset \\ 0, & \text{otherwise} \end{cases} \qquad (2)$$

The denominator counts all successor states of $D$ with a positive
precision, which gives us the maximally achievable score. The numerator
sums up the precisions of all successor states with a precision that
is greater than or equal to the completeness value, which gives us the
actually achieved score.
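The bottom-up computation can be sketched as a short recursive function. The sketch follows the reading of Equations (1) and (2) given in the text (the denominator counts the successors with positive precision). The example model, the completeness values and the convention that findings without an explicit completeness value use the threshold 0 are assumptions for illustration only.

```python
# Hypothetical example model: cause -> directly covered states.
MODEL = {
    "Flu": ["Fever", "Cold"],
    "Fever": ["Temperature:Increased", "Skin:Sweating"],
    "Cold": ["Nose:Running", "Throat:Sore"],
}
COMPLETENESS = {"Flu": 0.5, "Fever": 0.7, "Cold": 0.7}  # assumed values c(D)


def precision(state, model, observed, completeness):
    """Precision of a state, computed bottom-up from the observed findings."""
    if state not in model:                        # a finding: Equation (1)
        return 1.0 if state in observed else 0.0
    # a diagnosis: Equation (2), based on the precisions of its successors
    succ = [(e, precision(e, model, observed, completeness)) for e in model[state]]
    positive = [p for _, p in succ if p > 0]
    # successors reaching their completeness value (assumed 0 if unspecified)
    relevant = [p for e, p in succ if p >= completeness.get(e, 0.0)]
    return sum(relevant) / len(positive) if positive else 0.0
```

With the observation {Temperature:Increased, Skin:Sweating}, this yields a precision of 1 for Fever, 0 for Cold, and 1 for Flu, because Cold has no positive precision and therefore does not enter the denominator for Flu.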

The completeness value $c(D)$ of a diagnosis is specified by the modeler
and is motivated by the fact that a covering model for a diagnosis
will contain more states than the diagnosis will cause in an average
case. Nevertheless, in most cases the observation of a certain percentage
of the modeled states will justify the validation of this diagnosis. To
express this percentage the modeler has to specify a completeness
value $c(D)$. Unless this value is reached by the observation set in
the current case, the diagnosis will neither be considered as a validly
observed state, nor will it be considered as a valid hypothesis
candidate.

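Since we also want to consider multiple faults, i.e. hypotheses
containing more than one diagnosis, we define

$$\mathcal{H}^+ = \bigcup_{D \in \mathcal{H}} D^+, \qquad \mathcal{H}^+_{>0} = \bigcup_{D \in \mathcal{H}} D^+_{>0}, \qquad \mathcal{H}^+_{\geq c} = \bigcup_{D \in \mathcal{H}} D^+_{\geq c}.$$

The covering relations $r \in \mathcal{H}^+_{\geq c}$ are called relevant for $\mathcal{H}$. Observe
that relevancy depends on $\mathcal{F}_O$, since the precisions have been
computed based on $\mathcal{F}_O$.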

Quality Measures. The quality measures are used to rank the possible
hypotheses with respect to the given observation. As we have already
introduced the precision of a single diagnosis, we will now define
the quality of a hypothesis, which can contain multiple diagnoses.
The quality of a hypothesis provides a real value between 0 and 1
to describe the degree of accuracy with which the hypothesis $\mathcal{H}$ can
explain the given observation $\mathcal{F}_O$.

Definition 2.1 (Quality Measure) The quality $\varrho(\mathcal{H})$ of a hypothesis
$\mathcal{H}$ is given by

$$\varrho(\mathcal{H}) = \frac{\sum_{r \in \mathcal{H}^+_{\geq c}} \pi(e_r)}{|\mathcal{H}^+_{>0}| + |\mathcal{F}^{isolated}_{\mathcal{H},O}|} \qquad (3)$$

Notice that, in contrast to the precision, the quality measure does not
evaluate a single diagnosis with respect to the transitively observed
predictions, but assesses a hypothesis (containing possibly multiple
diagnoses) on the basis of the transitively predicted and observed
findings and the unexplained (isolated) findings.

We see that $\varrho(\mathcal{H}) \in [0, 1]$ for any hypothesis $\mathcal{H} \in \Omega_H$. The lower
bound 0 is obtained, if $\mathcal{H}^+_{\geq c} = \emptyset$; the upper bound 1 is obtained,
if all predictions are fully observed, i.e. $\mathcal{H}^+_{\geq c} = \mathcal{H}^+$, and the set
$\mathcal{F}^{isolated}_{\mathcal{H},O} = \emptyset$.

Example. For the covering relation given in Figure 2, the set

$$\mathcal{F}_O = \{ A_2{:}v_2, \ A_3{:}v_3, \ A_5{:}v_5, \ A_6{:}v_6 \}$$

of findings, and the hypothesis $\mathcal{H} = \{D_1\}$, we obtain $\pi(D_2) = 1$,
$\pi(D_3) = 1$ (with $c(D_2) = c(D_3) = 0.7$). Since we obtain
$\mathcal{H}^+ = \{r_1, r_2, r_3\}$ for hypothesis $\mathcal{H}$, we can calculate

$$\mathcal{H}^+_{\geq c} = \{r_1, r_2\},$$

$$\mathcal{F}^{isolated}_{\mathcal{H},O} = \{A_2{:}v_2, \ A_3{:}v_3\}.$$

Up to now we presented the basic representation for set-covering
models containing diagnoses and findings connected with covering
relations. Of course this simple representation might not always
meet the requirements of real-world applications. Therefore we will
briefly present knowledge extensions of set-covering models. In [2]
we showed how to apply these extensions in an incremental way.

2.2 Extension by Similarities and Weights

Similarities between findings and weights for states provide significant
knowledge extensions for set-covering models. In the following
we will show how to include these enhancements into the quality
measures given above.

Similarities. Consider a parameter $A$ with the domain

$$dom(A) = \{no, si, mi, hi\},$$

with the meanings normal (no), slightly increased (si), medium
increased (mi), and heavily increased (hi), where $A{:}hi$ is predicted.
We clearly see that the observation $A{:}mi$ deserves a better precision
than the observation $A{:}no$. Nevertheless the simple quality measure
considers both observations as unexplained findings and makes no
difference between the similarities of the parameter values. For this
reason we want to define similarities as an extension to set-covering
models.

We define the similarity function

$$sim : \Omega_V \times \Omega_V \rightarrow [0, 1]$$

to capture the similarity between two values assigned to the same
parameter. The value 0 means no similarity and the value 1 indicates
two equal values. In cluster analysis problems this function is also
called distance function (cf. [9]).

With similarities we need to adapt Equation (1) for computing the
precision of findings:

$$\pi(A{:}v) = sim(Val_{\mathcal{H}}(A), \ Val_{\mathcal{F}_O}(A)),$$

where $Val$ returns the value of a specified attribute contained in a
specified set of states:

$$Val : 2^{\Omega_S} \times \Omega_A \rightarrow \Omega_V.$$

If no special similarity is included in the model, then we get the
simple quality measure by defining the default similarity
$sim(v, v') = \delta_{v,v'}$, where $\delta_{v,v'} = 1$, if $v = v'$, and $\delta_{v,v'} = 0$, otherwise.
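A similarity-based finding precision can be sketched as a simple table lookup. The numeric similarity values below are invented for illustration (the paper does not give any), and the function names are hypothetical; without a table the sketch falls back to the default similarity, which is 1 for equal values and 0 otherwise.

```python
# Hypothetical, symmetric similarity values for dom(A) = {no, si, mi, hi}.
SIM_TABLE = {
    ("hi", "mi"): 0.7,
    ("hi", "si"): 0.3,
    ("hi", "no"): 0.0,
    ("mi", "si"): 0.6,
    ("mi", "no"): 0.2,
    ("si", "no"): 0.5,
}


def sim(v, w, table=None):
    """Similarity of two values of the same parameter, in [0, 1]."""
    if v == w:
        return 1.0
    if table is None:                  # default similarity (Kronecker delta)
        return 0.0
    return table.get((v, w), table.get((w, v), 0.0))


def finding_precision(predicted, observed, table=None):
    """Precision of a predicted value, given the observed value (or None)."""
    if observed is None:               # parameter was not observed at all
        return 0.0
    return sim(predicted, observed, table)
```

With this table a predicted value hi obtains the precision 0.7 when mi is observed and 0 when no is observed, whereas the default similarity assigns 0 in both cases.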

Weights. The introduction of weights for covered states is another
common generalization of the basic covering model. Here we apply
a weight function $w$ on the set of states $\Omega_S$, to emphasize that some states
(findings and diagnoses) have a more significant pathological
importance than other states.

When applying weights to the model we need to adapt Equation (2)
which calculates the precision for a given diagnosis:

$$\pi(D) = \begin{cases} \dfrac{\sum_{r \in D^+_{\geq c}} w(e_r) \cdot \pi(e_r)}{\sum_{r \in D^+_{>0}} w(e_r)}, & \text{if } D^+_{>0} \neq \emptyset \\ 0, & \text{otherwise.} \end{cases}$$

As for the precision of a diagnosis, we need to adapt Equation (3)
to calculate the quality of a given hypothesis:

$$\varrho(\mathcal{H}) = \frac{\sum_{r \in \mathcal{H}^+_{\geq c}} w(e_r) \cdot \pi(e_r)}{\sum_{r \in \mathcal{H}^+_{>0}} w(e_r) + \sum_{F \in \mathcal{F}^{isolated}_{\mathcal{H},O}} w(F)}$$

If all states have the same weight, i.e., $w(S) = 1$ for all $S \in \Omega_S$,
then the model reduces to the simple covering model.
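The weighted precision and quality can be sketched as follows. The sketch assumes the weighted quality is the weighted precision sum over the relevant relations of the hypothesis, divided by the weights of its relations with positive precision plus the weights of the isolated findings. The example model, the completeness values and the default weight 1 for states without an explicit weight are assumptions for illustration.

```python
# Hypothetical example model and completeness values.
MODEL = {
    "Flu": ["Fever", "Cold"],
    "Fever": ["Temperature:Increased", "Skin:Sweating"],
    "Cold": ["Nose:Running", "Throat:Sore"],
}
COMPLETENESS = {"Fever": 0.7, "Cold": 0.7}


def reachable(state, model):
    """States transitively covered by `state`."""
    seen, stack = set(), list(model.get(state, []))
    while stack:
        s = stack.pop()
        if s not in seen:
            seen.add(s)
            stack.extend(model.get(s, []))
    return seen


def precision(state, model, observed, c, w):
    """Weighted precision of a state (weight 1 if none is given)."""
    if state not in model:
        return 1.0 if state in observed else 0.0
    succ = [(e, precision(e, model, observed, c, w)) for e in model[state]]
    denom = sum(w.get(e, 1.0) for e, p in succ if p > 0)
    numer = sum(w.get(e, 1.0) * p for e, p in succ if p >= c.get(e, 0.0))
    return numer / denom if denom > 0 else 0.0


def quality(hypothesis, model, observed, c, w):
    """Weighted quality of a hypothesis (a set of diagnoses)."""
    effects = [e for d in hypothesis for e in model.get(d, [])]
    prec = {e: precision(e, model, observed, c, w) for e in effects}
    numer = sum(w.get(e, 1.0) * prec[e] for e in effects if prec[e] >= c.get(e, 0.0))
    denom = sum(w.get(e, 1.0) for e in effects if prec[e] > 0)
    covered = set().union(*(reachable(d, model) for d in hypothesis))
    denom += sum(w.get(f, 1.0) for f in set(observed) - covered)  # isolated findings
    return numer / denom if denom > 0 else 0.0
```

Calling `quality` with an empty weight map corresponds to the unweighted case $w(S) = 1$. Raising the weight of an isolated finding lowers the quality of every hypothesis that leaves it unexplained.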

In addition to similarities and weights we have already introduced
uncertain covering relations and causal effect functions as possible
extensions (cf. [2]).


3 Complex Covering Relations

In the previous section we introduced the basic set-covering model
and extensions that allow for the refinement of set-covering knowledge
built with basic covering relations. In this section we propose
some further extensions of the representation: AND-, OR- and
[MIN, MAX]-relations.

To keep the interpretation of covering models simple, we only allow
these extensions for covering relations between diagnoses and
(directly observable) findings.

3.1 Conjunction of Covering Relations

It is desirable to be able to represent conjunctions between covering
relations. An AND-covering relation

$$D \rightarrow_{AND} \{F_1, \ldots, F_n\}$$

denotes the characteristic that all covering relations $D \rightarrow F_i$ have
to be fulfilled simultaneously.

Figure 3. Covering relation $D \rightarrow_{AND} \{F_2, F_3\}$

Then the weights of the AND-connected findings $F_i$ will only
contribute to the precision of $D$ if all of these findings are observed.
If not all findings are observed, then $D$ cannot explain the findings
and we have to check if another diagnosis from the hypothesis can
explain these observations. All remaining findings, so far
unexplained, will be added to the set of isolated findings $\mathcal{F}^{isolated}_{\mathcal{H},O}$. This
will decrease the quality measure for the current hypothesis, since
$\mathcal{H}^+_{\geq c}$ will not contain relations covering the unexplained
observations. Given an AND-covering relation of the form

$$r = D \rightarrow_{AND} \{F_1, \ldots, F_n\}$$

we define for each $F_i \in \{F_1, \ldots, F_n\}$:

$$\pi_r(F_i) = \begin{cases} \pi(F_i), & \text{if } \pi(F_j) > 0 \text{ for all } 1 \leq j \leq n \\ 0, & \text{otherwise} \end{cases}$$

We try to explain all findings $F_i$ with $\pi_r(F_i) = 0$ but $\pi(F_i) > 0$ by
other diagnoses $D' \in \mathcal{H} \setminus \{D\}$. All remaining findings $F_i$, which
cannot be explained by other diagnoses, are added to $\mathcal{F}^{isolated}_{\mathcal{H},O}$.

Example. Assume that we have the covering model of Figure 3,
where $c(D) = 0.5$, and we observe the set $\mathcal{F}_O = \{F_1, F_2\}$.
Then $\pi(F_3) = 0$, since $F_3$ is not in $\mathcal{F}_O$. Therefore not all
precisions of the AND-covered findings are greater than 0, and we define
$\pi_r(F_2) = \pi_r(F_3) = 0$. We obtain $\mathcal{F}^{isolated}_{\mathcal{H},O} = \{F_2\}$ for hypothesis
$\mathcal{H} = \{D\}$. Notice that $F_3$ is not in $\mathcal{F}^{isolated}_{\mathcal{H},O}$, since it is not observed.
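The local precision of AND-covered findings can be sketched in a few lines. The sketch assumes that an AND-relation keeps the ordinary precision of its findings only if every connected finding has a positive precision, and sets all local precisions to 0 otherwise; the function names are hypothetical.

```python
def and_local_precision(findings, precision):
    """Local precision for r = D ->AND {F_1, ..., F_n}.

    `precision` maps findings to their ordinary precision (0 if absent).
    """
    fulfilled = all(precision.get(f, 0.0) > 0 for f in findings)
    return {f: (precision.get(f, 0.0) if fulfilled else 0.0) for f in findings}


def left_unexplained(findings, precision, local):
    """Findings with local precision 0 but positive ordinary precision.

    These have to be explained by other diagnoses of the hypothesis,
    or they become isolated findings.
    """
    return [f for f in findings if local[f] == 0 and precision.get(f, 0.0) > 0]
```

For the example of Figure 3 with observed findings F1 and F2, the relation over F2 and F3 is not fulfilled, both local precisions are 0, and F2 is left to be explained elsewhere.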


3.2  Disjunction  of  Covering  Relations 

We can also express alternative covering relations with disjunction.
Here we can distinguish between inclusive (OR) and exclusive
(XOR) disjunctions.

Figure 4. OR-/XOR-covering relations.

In Figure 4 we can see two different disjunctive covering relations
for diagnosis $D$: on the left the findings $F_2, F_3$ are connected
with the OR-covering relation $D \rightarrow_{OR} \{F_2, F_3\}$, whereas on the
right the findings are connected with an XOR-covering relation
$D \rightarrow_{XOR} \{F_2, F_3\}$. These OR/XOR-relations state that only one
of the connected findings has to be observed to fulfill the relation.
Of course we need to consider the different semantics in covering
models. When computing the quality measures we have to take the
following three cases into account:

1. If none of the predicted findings is observed, then nothing has
to be done. The covering relations connected with the OR/XOR-
condition cannot contribute to the quality measure of the parent
state.

2. If one of the predicted findings is observed, then we simply cut
all other states connected by OR/XOR-relations from the model.
When computing the quality measure we only take the observed
finding into account.

3. If more than one of the predicted findings is observed (e.g.
$\{F_2, F_3\} \subseteq \mathcal{F}_O$), then we have to differentiate between OR and
XOR relations. For both we take the finding with the maximal
contribution, e.g. regarding the weighted precision

$$\pi_w(F) = \pi(F) \cdot w(F).$$

For OR-relations we simply ignore the remaining observations for
assessing the quality. They will neither contribute to the quality of
the hypothesis nor will they need to be explained by other
diagnoses.

For XOR-relations the observations left over still have to be
explained. As for the AND-relations we try to explain them with
the other diagnoses contained in the current hypothesis. All
remaining findings that cannot be explained by other diagnoses are
added to the set of isolated findings $\mathcal{F}^{isolated}_{\mathcal{H},O}$.

We see that OR/XOR-relations have to be used carefully, because of
their different interpretation of the observation. For example,
multiple observations of one XOR-covering relation are taken negatively
into account (i.e., they are assumed to be unexplained findings of the
current hypothesis), whereas in ordinary OR-relations they will not
contribute in any way.

As shown for AND-covering relations we also have to locally define
the precision for OR/XOR-covered findings in the context of the given
diagnosis: Consider an OR-relation (analogous for XOR):

$$r = D \rightarrow_{OR} \{F_1, \ldots, F_n\}.$$

We select a finding $F_{max} \in \{F_1, \ldots, F_n\}$, such that
$\pi_w(F_{max}) = \max(\pi_w(F_i), \ 1 \leq i \leq n)$. Then we say that

$$\pi_r(F_i) = \begin{cases} \pi(F_i), & \text{if } F_i = F_{max} \\ 0, & \text{otherwise.} \end{cases}$$

If there is more than one $F_i$ with maximum weighted precision
$\pi_w(F_i)$, then all but one (randomly selected) finding will be set to the
precision $\pi_r(F_i) = 0$.

When we compute the precision $\pi(D)$ of a diagnosis $D$, then the
precisions of the findings $F$ that are covered by an OR/XOR-covering
relation contribute with the measure $\pi_r(F)$ and not with the usual
precision measure $\pi(F)$.

For XOR-relations we have to explain the remaining findings by other
diagnoses contained in the hypothesis or add them to $\mathcal{F}^{isolated}_{\mathcal{H},O}$.
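The local precision for OR- and XOR-covered findings can be sketched as follows. The sketch keeps the ordinary precision only for the finding with the maximal weighted precision and sets the others to 0; for XOR it additionally returns the other observed findings, which still need an explanation. Two details are simplifications of this illustration and not taken from the paper: ties are broken by taking the first maximal finding instead of a random one, and missing weights default to 1.

```python
def disjunctive_local_precision(findings, precision, weight, exclusive=False):
    """Local precision for r = D ->OR {F_1..F_n} (or XOR if `exclusive`).

    Returns (local, leftover): `local` maps each finding to its local
    precision; `leftover` lists the findings that other diagnoses still
    have to explain (always empty for OR-relations).
    `findings` is a non-empty list.
    """
    weighted = {f: precision.get(f, 0.0) * weight.get(f, 1.0) for f in findings}
    best = max(findings, key=lambda f: weighted[f])   # first maximum on ties
    local = {f: 0.0 for f in findings}
    if weighted[best] > 0:                            # at least one finding observed
        local[best] = precision[best]
    leftover = []
    if exclusive:
        leftover = [f for f in findings
                    if f != best and precision.get(f, 0.0) > 0]
    return local, leftover
```

If both F2 and F3 are observed and F3 has the higher weight, only F3 contributes to the precision of D; under XOR the finding F2 is returned as left over, under OR it is simply ignored.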


3.3  Cardinalities  in  Covering  Relations 

Another enrichment of the set-covering representation is the
connection of covering relations by cardinality constraints. We express such
cardinalities by [MIN, MAX]-covering relations. Consider the
example in Figure 5. The covering relation between diagnosis $D$ and the
findings $F_1$, $F_2$, $F_3$, $F_4$ and $F_5$ means that between 2 and 4 of the
predicted findings have to be observed. We denote such relations by

$$r = D \rightarrow_{[2,4]} \{F_1, F_2, F_3, F_4, F_5\}.$$

Figure 5. A [MIN, MAX]-covering relation.

When we interpret [MIN, MAX]-relations $r = D \rightarrow_{[MIN,MAX]} \mathcal{F}$, then
we have to consider three possible cases for the number
$k = |\mathcal{F} \cap \mathcal{F}_O|$ of relevant findings:

1. If $k \in [MIN, MAX]$, then all findings in $\mathcal{F} \cap \mathcal{F}_O$ will contribute.

2. If $k > MAX$, then let $\mathcal{F}_{max} \subseteq \mathcal{F} \cap \mathcal{F}_O$ be the MAX findings
with the maximum weighted precisions among the findings in $\mathcal{F}$
(i.e. $|\mathcal{F}_{max}| = MAX$). We explain the findings in $\mathcal{F}_{max}$ by $D$.
Then we try to explain the findings in $(\mathcal{F} \cap \mathcal{F}_O) \setminus \mathcal{F}_{max}$ by other
diagnoses also contained in the hypothesis. Those findings in
$(\mathcal{F} \cap \mathcal{F}_O) \setminus \mathcal{F}_{max}$ which we cannot explain by other diagnoses
$D' \in \mathcal{H} \setminus \{D\}$ are added to $\mathcal{F}^{isolated}_{\mathcal{H},O}$.

3. If $k < MIN$, then we try to explain all findings in $\mathcal{F} \cap \mathcal{F}_O$ by other
diagnoses $D' \in \mathcal{H} \setminus \{D\}$. Findings which cannot be explained
by other diagnoses are added to $\mathcal{F}^{isolated}_{\mathcal{H},O}$.

We integrate [MIN, MAX]-relations into set-covering models by
locally defining the precision for findings connected by a [MIN, MAX]-
relation $r = D \rightarrow_{[MIN,MAX]} \mathcal{F}$. Then we say that for each $F \in \mathcal{F}$:

$$\pi_r(F) = \begin{cases} 0, & \text{if } k < MIN, \text{ or if } k > MAX \wedge F \notin \mathcal{F}_{max} \\ \pi(F), & \text{if } k \in [MIN, MAX], \text{ or if } k > MAX \wedge F \in \mathcal{F}_{max} \end{cases}$$

where $\mathcal{F}_{max}$ is again the set of the MAX findings with the best
weighted precisions among the findings in $\mathcal{F}$.

When calculating the quality measure for a diagnosis or hypothesis
we apply the precision $\pi_r(F)$ for all findings $F$ connected by the
relation $r$. Findings $F$ with $\pi_r(F) = 0$ but $\pi(F) > 0$ need to be
explained by other diagnoses contained in the hypothesis or will be
added to $\mathcal{F}^{isolated}_{\mathcal{H},O}$.
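The three cases of a [MIN, MAX]-relation can be sketched in one function. As an assumption of this illustration, a finding counts towards $k$ if its ordinary precision is positive, and missing weights default to 1; the function name and the parameter names `lo` and `hi` (for MIN and MAX) are hypothetical.

```python
def minmax_local_precision(findings, precision, weight, lo, hi):
    """Local precision for a [MIN, MAX]-relation over `findings`.

    Returns (local, leftover): `local` maps each finding to its local
    precision; `leftover` lists observed findings that D does not explain.
    """
    observed = [f for f in findings if precision.get(f, 0.0) > 0]
    k = len(observed)
    if k < lo:                                   # too few findings observed
        kept = []
    elif k <= hi:                                # k lies within [MIN, MAX]
        kept = observed
    else:                                        # keep only the MAX best findings
        ranked = sorted(observed,
                        key=lambda f: precision[f] * weight.get(f, 1.0),
                        reverse=True)
        kept = ranked[:hi]
    local = {f: (precision.get(f, 0.0) if f in kept else 0.0) for f in findings}
    leftover = [f for f in observed if f not in kept]
    return local, leftover
```

For the relation of Figure 5 with bounds [2, 4], observing all five findings leaves the finding with the lowest weighted precision unexplained by D, while observing a single finding leaves that finding unexplained and contributes nothing to the precision of D.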

It is worth mentioning that ordinary covering relations for a diagnosis
follow a similar concept, since we only consider predicted
findings that are also observed, but not all predicted findings of
the diagnosis. But as opposed to [MIN, MAX]-relations all observed
predictions will contribute to the quality. In [MIN, MAX]-relations
only MAX observed findings will contribute; any findings beyond MAX
have to be explained by other diagnoses. In general, an ordinary
covering model for a diagnosis $D$ with $n$ covered findings is
comparable to a $[c(D) \cdot n, \ n]$-relation connecting the $n$ findings.


3.4  Bounded  Covering  Relations 

The introduction of similarities for finding values is a useful
knowledge extension. Nevertheless in some situations the expert wants to
express that a relation is only fulfilled if a covered parameter is
observed with exactly the predicted value, rather than a similar value.
Therefore we supplement necessary covering relations, disjunctive,
conjunctive and constrained covering relations with the optional
label bounded. We obtain the required behaviour by locally defining
the default similarity measure for bounded relations:

$$sim(Val_{\mathcal{H}}(A), \ Val_{\mathcal{F}_O}(A)) = \delta_{Val_{\mathcal{H}}(A), \, Val_{\mathcal{F}_O}(A)}.$$

I.e., the precision 1 is assigned to a parameter $A$ only if it is
observed with the predicted value.

4 Constraints  for  Hypothesis  Generation 

As mentioned in the introduction of Section 2, the problem of
hypothesis generation is exponential, since for $n$ diagnoses we need to
consider about $2^n$ hypotheses in the worst case for an observation.
In the following we want to sketch some heuristics to restrict the
hypothesis space.

In a first step, we will filter all diagnoses $D \in \Omega_D$ that are
relevant, i.e. that have the minimum precision. For this, we define the set
of relevant diagnoses

$$\Omega'_D = \{ D \in \Omega_D \mid \pi(D) \geq c(D) \}.$$

Then only diagnoses $D \in \Omega'_D$ will be taken into account when
generating hypotheses. Before describing concepts to shrink the set
of hypotheses, we will define generators as a compact representation
for sets of hypotheses, which were introduced by Reggia et al.
[1].

Definition 4.1 (Generator) A generator $\mathcal{G}_I = \{G_1, \ldots, G_n\}$
consists of non-empty pairwise-disjoint subsets $G_i \subseteq \Omega'_D$. The set of
hypotheses $\mathcal{H}_{\mathcal{G}_I}$ generated by $\mathcal{G}_I$ is defined as

$$\mathcal{H}_{\mathcal{G}_I} = \{ \mathcal{H} \subseteq \Omega_D \mid |\mathcal{H} \cap G_i| \leq 1, \text{ for all } 1 \leq i \leq n \}.$$

For $\mathcal{G}_I = \emptyset$, it holds that $\mathcal{H}_{\mathcal{G}_I} = \{\emptyset\}$. We can see that $\mathcal{H}_{\mathcal{G}_I}$ is
analogous to a Cartesian set product.

For example, for the set-covering model defined in Figure 1 and
$\mathcal{F}_O = \{temp{:}inc, \ skin{:}sweat, \ nose{:}red\}$, we obtain
$\mathcal{G} = \{\mathcal{G}_1, \mathcal{G}_2\}$ with $\mathcal{G}_1 = \{\{cold\}, \{fever\}\}$ and $\mathcal{G}_2 = \{\{flu\}\}$. So we
can compute $\mathcal{H}_{\mathcal{G}} = \{\emptyset, \{cold\}, \{fever\}, \{cold, fever\}, \{flu\}\}$ to be
the set of interesting hypotheses.
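The expansion of generators into hypotheses can be sketched with a Cartesian product in which every set $G_i$ contributes either nothing or exactly one of its diagnoses. As an assumption of this sketch, the generated hypotheses are restricted to diagnoses that occur in the generator; the function names are hypothetical.

```python
from itertools import product


def hypotheses_of(generator):
    """Hypotheses containing at most one diagnosis from each set G_i.

    `generator` is a list of non-empty, pairwise-disjoint sets of diagnoses.
    """
    choices = [[None] + sorted(g) for g in generator]   # None: take nothing from G_i
    return {frozenset(d for d in pick if d is not None)
            for pick in product(*choices)}


def hypotheses_of_all(generators):
    """Union of the hypotheses generated by several generators."""
    result = set()
    for generator in generators:
        result |= hypotheses_of(generator)
    return result
```

Applied to the two generators of the example above, `hypotheses_of_all([[{"cold"}, {"fever"}], [{"flu"}]])` yields the five hypotheses listed there, and an empty generator yields only the empty hypothesis.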

A method for computing and updating generator sets is extensively
described in [4]. Generators are used to efficiently generate hypotheses
in an incremental manner: In a first step, sets of generators
describing higher level diagnoses (concepts) are created. For hypotheses
containing higher level diagnoses and having a high quality measure,
we build sets of generators containing underlying specialized
diagnoses and test them with their corresponding quality measure.
In the following, we introduce two basic knowledge extensions that
additionally shrink the space of generated hypotheses.

4.1  Exclusion  Constraints 

We can define exclusion constraints to filter diagnoses from the
process of hypothesis generation. In general, two kinds of constraints
are possible:

$\neg(D \wedge F_1 \wedge \cdots \wedge F_n)$

If findings $F_1, \ldots, F_n$ are observed, then remove generated
hypotheses containing diagnosis $D$.

$\neg(D_1 \wedge \cdots \wedge D_m)$

Remove generated hypotheses containing all the diagnoses
$D_1, \ldots, D_m$ at the same time.

Thus, we create hypotheses using generator sets and check each
generated hypothesis against the available exclusion constraints. If one
exclusion constraint evaluates to true, the hypothesis is discarded.

It is worth noticing that the modification of generator sets with
respect to exclusion constraints yields a combinatorial size of
generators and therefore is not reasonable. An evaluation of the
generated hypotheses according to existing exclusion constraints has
proven to be more efficient.
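Checking generated hypotheses against exclusion constraints can be sketched as a simple filter. In this illustration both kinds of constraints are encoded uniformly as a pair of a diagnosis set and a finding set (the second kind has an empty finding set); this encoding and the function names are assumptions of the sketch.

```python
def violates(hypothesis, observed, constraint):
    """True if the hypothesis is excluded by the constraint.

    `constraint` is a pair (diagnoses, findings): the hypothesis is excluded
    if it contains all listed diagnoses and all listed findings are observed.
    """
    diagnoses, findings = constraint
    return set(diagnoses) <= set(hypothesis) and set(findings) <= set(observed)


def filter_hypotheses(hypotheses, observed, constraints):
    """Keep only the hypotheses that violate no exclusion constraint."""
    return [h for h in hypotheses
            if not any(violates(h, observed, c) for c in constraints)]
```

The filter is applied after generation, in line with the observation above that modifying the generator sets themselves is not reasonable.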

4.2  Necessary  Covering  Relations 

A stronger type of covering relations are necessary covering
relations. A necessary covering relation between a diagnosis $D$ and a
finding $F_1$ means that $D$ necessarily covers $F_1$ and that $F_1$ always
has to be observed if $D$ is hypothesized. We depict a necessary
covering relation with $D \rightarrow_{nec} F_1$ as shown in Figure 6.

Figure 6. Necessary covering relation for a diagnosis $D$.

For applying necessary covering relations we introduce an adapted
definition of the precision $\pi_{nec}$ for each diagnosis $D \in \Omega_D$:

$$\pi_{nec}(D) = \begin{cases} 0, & \text{if } \exists \, r \in \Omega_R : r = D \rightarrow_{nec} F \text{ with } F \in \Omega_F \wedge \pi(F) < \tau \\ \pi(D), & \text{otherwise} \end{cases}$$

where $\tau \in [0, 1]$ is a specified threshold, which defines when a
finding is sufficiently observed (e.g. $\tau = 0.8$).

Therefore a diagnosis $D$ does not propagate any contribution to its
parent states until all necessarily covered findings are (sufficiently)
observed. Consequently, $D$ will not appear in any generator and thus
will not be included in any hypothesis.
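The adapted precision for necessary covering relations can be sketched as a guard in front of the ordinary precision. The function name, the map `necessary` from diagnoses to their necessarily covered findings, and the parameter name `tau` for the threshold are assumptions of this illustration.

```python
def precision_nec(diagnosis, precision, necessary, tau=0.8):
    """Adapted precision of a diagnosis with necessary covering relations.

    `precision` maps states to their ordinary precision; `necessary` maps a
    diagnosis to the findings it necessarily covers.
    """
    for finding in necessary.get(diagnosis, []):
        if precision.get(finding, 0.0) < tau:    # finding not sufficiently observed
            return 0.0
    return precision.get(diagnosis, 0.0)
```

A diagnosis whose necessary finding has a precision below the threshold thus receives the precision 0, regardless of how well its other findings are observed.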

5 Conclusions  and  Future  Work 

After describing the basic structures of set-covering relations we
have shown how to enrich the model with additional knowledge like
similarities or weights. We also considered the computation of
quality measures of these parts. Furthermore, we have shown
representational extensions to the set-covering model to facilitate necessary,
disjunctive, conjunctive or constrained covering relations. An
important characteristic of all these extensions is their incrementality: some
enhancements can be added to refine special aspects of the model
but will not change its basic semantics; others are used to guide the
process of candidate generation.


In the future we are planning to work on the following fields:
Incremental development requires restructuring the model from time
to time. We are currently working on restructuring methods for
set-covering models that do not alter the basic semantics but improve the
design of the diagnosis knowledge. In software engineering,
refactoring [10, 11] has emerged as the corresponding method. In
general we have to look at validation techniques for set-covering
models besides simple case testing. Because of the special structure of
the model we also have to consider static verification techniques for
the set-covering representation. For a survey in this field we refer to
[12, 13, 14, 15].

In this paper we presented a manual development of set-covering
models. But it seems to be possible to learn coarse models
automatically from a small number of available cases. Later on these models
should be refined by the developer with additional knowledge. With
such a semi-automatic development step, the initial costs of
knowledge acquisition can be reduced. Some work in this field
has been done by Thompson et al. [16] and Wang et al. [17]. This step
is not needed if we have a sufficiently large set of data, since then
traditional machine learning methods (e.g. learning neural networks,
learning Bayes networks) seem to be more appropriate.